Phrase Pair Rescoring with Term Weighting for Statistical Machine Translation
Authors
Abstract
We propose to score phrase translation pairs for statistical machine translation using term-weight-based models. These models employ tf.idf to encode the weights of content and non-content words in phrase translation pairs. The translation probability is then modeled by similarity functions defined in a vector space. Two similarity functions are compared. Using these models in a statistical machine translation task yields significant improvements.
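As a rough illustration of the idea in the abstract, the sketch below weights the words of a phrase pair with tf.idf and scores the pair with a similarity function in a vector space. Everything here is an assumption made for illustration: the toy corpora, the hypothetical word lexicon used to carry source weights into the target vocabulary, the smoothed idf formula, and the choice of cosine similarity, which is not necessarily one of the two functions the paper compares.

```python
import math
from collections import Counter

def idf(term, documents):
    """Smoothed inverse document frequency over tokenized documents (assumed formula)."""
    df = sum(1 for doc in documents if term in doc)
    return math.log((1 + len(documents)) / (1 + df)) + 1.0

def tfidf_vector(phrase, documents):
    """Map a tokenized phrase to a sparse {term: tf * idf} vector."""
    counts = Counter(phrase)
    return {term: count * idf(term, documents) for term, count in counts.items()}

def project(vector, lexicon):
    """Carry source-side weights into the target vocabulary via a word lexicon."""
    projected = {}
    for term, weight in vector.items():
        for target, prob in lexicon.get(term, {}).items():
            projected[target] = projected.get(target, 0.0) + weight * prob
    return projected

def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def score_pair(src_phrase, tgt_phrase, src_docs, tgt_docs, lexicon):
    """Similarity-based score for one phrase pair (illustrative, not a probability)."""
    src_vec = project(tfidf_vector(src_phrase, src_docs), lexicon)
    tgt_vec = tfidf_vector(tgt_phrase, tgt_docs)
    return cosine(src_vec, tgt_vec)

# Toy parallel data and a hypothetical one-to-one lexicon.
src_docs = [["das", "haus"], ["das", "auto"], ["das", "kleine", "haus"]]
tgt_docs = [["the", "house"], ["the", "car"], ["the", "small", "house"]]
lexicon = {
    "das": {"the": 1.0},
    "haus": {"house": 1.0},
    "auto": {"car": 1.0},
    "kleine": {"small": 1.0},
}

score_good = score_pair(["das", "haus"], ["the", "house"], src_docs, tgt_docs, lexicon)
score_bad = score_pair(["das", "haus"], ["the", "car"], src_docs, tgt_docs, lexicon)
print(score_good, score_bad)
```

In this toy setup the frequent function word ("das" / "the") receives a lower weight than the content words, so a pair that agrees only on the function word scores well below a pair that also agrees on the content word.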
Similar resources
Feature-Rich Discriminative Phrase Rescoring for SMT
This paper proposes a new approach to phrase rescoring for statistical machine translation (SMT). A set of novel features capturing the translingual equivalence between a source and a target phrase pair are introduced. These features are combined with a linear regression model and a neural network to predict the quality score of the phrase translation pair. These phrase scores are used to discrimin...
A simple and effective weighted phrase extraction for machine translation adaptation
The task of domain-adaptation attempts to exploit data mainly drawn from one domain (e.g. news) to maximize the performance on the test domain (e.g. weblogs). In previous work, weighting the training instances was used for filtering dissimilar data. We extend this by incorporating the weights directly into the standard phrase training procedure of statistical machine translation (SMT). This all...
The ISL Phrase-Based MT System for the 2007 ACL Workshop on Statistical Machine Translation
In this paper we describe the Interactive Systems Laboratories (ISL) phrase-based machine translation system used in the shared task “Machine Translation for European Languages” of the ACL 2007 Workshop on Statistical Machine Translation. We present results for a system combination of the ISL syntax-augmented MT system and the ISL phrase-based system by combining and rescoring the n-best lists ...
Bilingual Structured Language Models for Statistical Machine Translation
This paper describes a novel target-side syntactic language model for phrase-based statistical machine translation, the bilingual structured language model. Our approach represents a new way to adapt structured language models (Chelba and Jelinek, 2000) to statistical machine translation, and a first attempt to adapt them to phrase-based statistical machine translation. We propose a number of variat...